lov asz extension
Minimisation of Submodular Functions Using Gaussian Zeroth-Order Random Oracles
Farzin, Amir Ali, Pun, Yuen-Man, Braun, Philipp, Summers, Tyler, Shames, Iman
We consider the minimisation problem of submodular functions and investigate the application of a zeroth-order method to this problem. The method is based on exploiting a Gaussian smoothing random oracle to estimate the smoothed function gradient. We prove the convergence of the algorithm to a global $ε$-approximate solution in the offline case and show that the algorithm is Hannan-consistent in the online case with respect to static regret. Moreover, we show that the algorithm achieves $O(\sqrt{NP_N^\ast})$ dynamic regret, where $N$ is the number of iterations and $P_N^\ast$ is the path length. The complexity analysis and hyperparameter selection are presented for all the cases. The theoretical results are illustrated via numerical examples.
A Fixed Point Framework for the Existence of EFX Allocations
We consider the problem of the existence of an envy-free allocation up to any good (EFX) for linear valuations and establish new results by connecting this problem to a fixed point framework. Specifically, we first use randomized rounding to extend the discrete EFX constraints into a continuous space and show that an EFX allocation exists if and only if the optimal value of the continuously extended objective function is nonpositive. In particular, we demonstrate that this optimization problem can be formulated as an unconstrained difference of convex (DC) program, which can be further simplified to the minimization of a piecewise linear concave function over a polytope. Leveraging this connection, we show that the proposed DC program has a nonpositive optimal objective value if and only if a well-defined continuous vector map admits a fixed point. Crucially, we prove that the reformulated fixed point problem satisfies all the conditions of Brouwer's fixed point theorem, except that self-containedness is violated by an arbitrarily small positive constant. To address this, we propose a slightly perturbed continuous map that always admits a fixed point. This fixed point serves as a proxy for the fixed point (if it exists) of the original map, and hence for an EFX allocation through an appropriate transformation. Our results offer a new approach to establishing the existence of EFX allocations through fixed point theorems. Moreover, the equivalence with DC programming enables a more efficient and systematic method for computing such allocations (if one exists) using tools from nonlinear optimization. Our findings bridge the discrete problem of finding an EFX allocation with two continuous frameworks: solving an unconstrained DC program and identifying a fixed point of a continuous vector map.
Online Nonsubmodular Optimization with Delayed Feedback in the Bandit Setting
Yang, Sifan, Wan, Yuanyu, Zhang, Lijun
We investigate the online nonsubmodular optimization with delayed feedback in the bandit setting, where the loss function is $α$-weakly DR-submodular and $β$-weakly DR-supermodular. Previous work has established an $(α,β)$-regret bound of $\mathcal{O}(nd^{1/3}T^{2/3})$, where $n$ is the dimensionality and $d$ is the maximum delay. However, its regret bound relies on the maximum delay and is thus sensitive to irregular delays. Additionally, it couples the effects of delays and bandit feedback as its bound is the product of the delay term and the $\mathcal{O}(nT^{2/3})$ regret bound in the bandit setting without delayed feedback. In this paper, we develop two algorithms to address these limitations, respectively. Firstly, we propose a novel method, namely DBGD-NF, which employs the one-point gradient estimator and utilizes all the available estimated gradients in each round to update the decision. It achieves a better $\mathcal{O}(n\bar{d}^{1/3}T^{2/3})$ regret bound, which is relevant to the average delay $\bar{d} = \frac{1}{T}\sum_{t=1}^T d_t\leq d$. Secondly, we extend DBGD-NF by employing a blocking update mechanism to decouple the joint effect of the delays and bandit feedback, which enjoys an $\mathcal{O}(n(T^{2/3} + \sqrt{dT}))$ regret bound. When $d = \mathcal{O}(T^{1/3})$, our regret bound matches the $\mathcal{O}(nT^{2/3})$ bound in the bandit setting without delayed feedback. Compared to our first $\mathcal{O}(n\bar{d}^{1/3}T^{2/3})$ bound, it is more advantageous when the maximum delay $d = o(\bar{d}^{2/3}T^{1/3})$. Finally, we conduct experiments on structured sparse learning to demonstrate the superiority of our methods.
- Asia > China > Jiangsu Province > Nanjing (0.05)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Zhejiang Province > Hangzhou (0.04)